Parametric multi-channel audio representation



The invention relates to a method of encoding a multi-channel audio signal, an 
encoder for encoding a multi-channel audio signal, an apparatus for supplying an audio 
signal, an encoded audio signal, a storage medium on which the encoded audio signal is 
stored, a method of decoding an encoded audio signal, a decoder for decoding an encoded 
audio signal, and an apparatus for supplying a decoded audio signal.



EP-A-1 107232 discloses a parametric coding scheme to generate a
representation of a stereo audio signal which is composed of a left channel signal and a right
channel signal. To efficiently utilize transmission bandwidth, such a representation contains
information concerning only a monaural signal which is either the left channel signal or the
right channel signal, and parametric information. The other stereo signal can be recovered
based on the monaural signal together with the parametric information. The parametric
information comprises localization cues of the stereo audio signal, including intensity and
phase characteristics of the left and the right channel.



It is an object of the invention to provide a parametric multi-channel audio
system which is able to scale the quality of the encoded audio signal with the available bit
rate or to scale the quality of the decoded audio signal with the complexity of the decoder or
the available transmission bandwidth.

A first aspect of the invention provides a method of encoding a multi-channel
audio signal as claimed in claim 1. A second aspect of the invention provides a method of
encoding a multi-channel audio signal as claimed in claim 2. A third aspect of the invention
provides an encoder for encoding a multi-channel audio signal as claimed in claim 14. A
fourth aspect of the invention provides an encoder for encoding a multi-channel audio signal
as claimed in claim 15. A fifth aspect of the invention provides an apparatus for supplying an
audio signal as claimed in claim 16. A sixth aspect of the invention provides an encoded
audio signal as claimed in claim 17. A seventh aspect of the invention provides a storage
medium on which the encoded signal is stored as claimed in claim 18. An eighth aspect of the
invention provides a method of decoding as claimed in claim 19. A ninth aspect of the
invention provides a decoder for decoding an encoded audio signal as claimed in claim 20. A
tenth aspect of the invention provides an apparatus for supplying a decoded audio signal as
claimed in claim 21. Advantageous embodiments are defined in the dependent claims.

In the method of encoding a multi-channel audio signal in accordance with the
first aspect of the invention, a single channel audio signal is generated. Further, information
is generated from the multi-channel audio signal allowing recovery, with a required quality
level, of the multi-channel audio signal from the single channel audio signal and the
information. The information comprises sets of parameters, for example, as known
from EP-A-1 107232.

In accordance with the first aspect of the invention, the information is
generated by determining a first portion of the information for a first frequency region of the
multi-channel audio signal, and by determining a second portion of the information for a
second frequency region of the multi-channel audio signal. The second frequency region is a
portion of the first frequency region and thus is a sub-range of the first frequency region.
Now, two levels of quality of decoding are possible. For a low quality level of the decoded
multi-channel audio signal, the decoder uses the encoded single channel audio signal, and the
first portion of the information. For a higher quality level, the decoder uses the encoded
single channel audio signal, and both the first and the second portion of the information. Of
course, it is possible to select the decoding quality out of a multitude of levels if a multitude
of portions of information each being associated with a different frequency region are
present. For example, the first portion may comprise a single set of parameters determined
within a frequency region which covers the full bandwidth of the multi-channel audio signal.
And the second portion may comprise several sets of parameters, each set of parameters
being determined for a sub-range or portion of the full bandwidth. Together, the portions
preferably cover the full bandwidth. But many other possibilities exist. For example, the first
portion may comprise two sets of parameters, the first set being determined for a frequency
region which covers a lower part of the full bandwidth, and the second set being determined
for a frequency region covering the other part of the full bandwidth. The second portion may
comprise two sets of parameters determined for two frequency regions within the lower part
of the full bandwidth. It is not required that the number of sets of parameters for the lower
part and the higher part of the full bandwidth are equal.

This representation of the encoded audio signal allows a quality of the
decoded audio signal to depend on the complexity of the decoder. For example, in a simple
portable decoder a low complexity decoder may be used which has a low power consumption
and which is therefore able to use only part of the information. In a high end application, a
complex decoder is used which uses all the information available in the coded signal.

The quality of the decoded audio can also depend on the available 
transmission bandwidth. If the transmission bandwidth is high the decoder can decode all 
available layers, since they are all transmitted. If the transmission bandwidth is low the 
transmitter can decide to only transmit a limited number of layers. 

In a second aspect of the invention, the encoder receives a maximum
allowable bit rate of the encoded multi-channel audio signal. This maximum allowable bit
rate may be defined by the available bit rate of a transmission channel such as the Internet, or of
a storage medium. In applications wherein the transmission bandwidth is variable and thus
the maximum allowable bit rate changes in time, it is important to be able to adapt to these
fluctuations of the transmission bandwidth to prevent a very low quality of the decoded audio
signal. Normally, the encoder encodes all available layers. It is decided at the transmitting
end what layers to transmit, depending on the available channel capacity. It is possible to do
this with the encoder in the loop, but this is more complicated than just stripping some layers
prior to transmission.

The encoder only adds the second portion of the information for the second
frequency region of the multi-channel audio signal to the encoded audio signal if a bit rate of
the encoded multi-channel audio signal which comprises the single channel audio signal, and
the first and second portion of the information is not higher than the maximum allowable bit
rate. Thus, the second portion is not present in the coded audio signal if the transmission
bandwidth is not large enough to support the transmission of the second portion.

In an embodiment as defined in claim 4, the information comprises sets of
parameters, and each one of the portions of the information is represented by one or more sets
of parameters, the number of sets of parameters depending on the number of frequency regions
present in the portions of the information.

In an embodiment as defined in claim 6, the sets of parameters comprise at
least one of the localization cues.

In an embodiment as defined in claim 7, the first frequency region
substantially covers the full bandwidth of the multi-channel audio signal. In this way, one set
of parameters suffices to provide the basic information required to decode the single channel
audio signal into the multi-channel audio signal. In this way a basic level of quality of the
decoded audio signal is guaranteed. The second frequency range covers part of the full
bandwidth. In this way, the second portion, when present in the coded audio signal, improves
the quality of the decoded audio signal in this frequency range.

In an embodiment as defined in claim 8, the second portion of the information
comprises at least two frequency ranges which together substantially cover the full bandwidth
of the multi-channel audio signal. In this way, the quality improvement provided by the
second portion is present over the complete bandwidth.

In an embodiment as defined in claim 9, [...]
encoded audio signal. The enhancement layer which comprises the second portion of the
information is encoded only if the bit rate of the encoded audio signal does not exceed the
maximally allowable bit rate. In this way, the quality of the decoded audio signal will depend
on the maximally allowable bit rate. If the maximally allowable bit rate is too low to
accommodate the enhancement layer, the decoded audio signal will be obtained from the
base layer which will produce a better quality of the decoded audio than will be the case if
unpredictable parts of the coded audio will not reach the decoder.

In the embodiments as defined in any one of the claims 10 to 12, the portions
of the information (usually containing sets of parameters, one set for each frequency band
represented) in a next frame are coded based on the parameters of the previous frame.
Usually, this reduces the bit rate of the encoded portions of the information, because, due to
correlation, the information in two successive frames will not differ substantially.

In the embodiments as defined in claim 13, the difference of the parameters of
two successive frames is coded instead of the parameters themselves.

Prior solutions in audio coders that have been suggested to reduce the bit rate
of stereo program material include intensity stereo and M/S stereo.

In the intensity stereo algorithm, high frequencies (typically above 5 kHz) are
represented by a single audio signal (i.e., mono) combined with time-varying and
frequency-dependent scale factors or intensity factors which allow recovery of a decoded audio
signal which resembles the original stereo signal for these frequency regions. In the M/S
algorithm, the signal is decomposed into a sum (or mid, or common) signal and a difference (or
side, or uncommon) signal. This decomposition is sometimes combined with principal component
analysis or time-varying scale factors. These signals are then coded independently, either by
a transform coder or sub-band coder (which are both waveform coders). The amount of
information reduction achieved by this algorithm strongly depends on the spatial properties
of the source signal. For example, if the source signal is monaural, the difference signal is
zero and can be discarded. However, if the correlation of the left and right audio signals is
low (which is often the case for the higher frequency regions), this scheme offers only little
bit rate reduction. For the lower frequency regions M/S coding generally provides significant
merit.

Parametric descriptions of audio signals have gained interest during the last
years, especially in the field of audio coding. It has been shown that transmitting (quantized)
parameters that describe audio signals requires only little transmission capacity to
re-synthesize a perceptually equal signal at the receiving end. However, current parametric
audio coders focus on coding monaural signals, and stereo signals are processed as dual
mono signals.

These and other aspects of the invention are apparent from and will be
elucidated with reference to the embodiments described hereinafter.

In the drawings:

Fig. 1 shows a block diagram of a multi-channel encoder for stereo audio,
Fig. 2 shows a block diagram of a multi-channel decoder for stereo audio,
Fig. 3 shows a representation of the encoded data stream,
Fig. 4 shows an embodiment of the frequency ranges in accordance with the invention,
Fig. 5 shows another embodiment of the frequency ranges in accordance with the invention,
Fig. 6 shows the determination of the sets of parameters based on parameters
in a previous frame in accordance with an embodiment of the invention,
Fig. 7 shows a set of parameters,
Fig. 8 shows the differential determination of the parameters of the base layer, and
Fig. 9 shows the differential determination of the parameters corresponding to
a frequency region of an enhancement layer.

Fig. 1 shows a block diagram of a multi-channel encoder. The encoder
receives a multi-channel audio signal which is shown as a stereo signal RI, LI and the
encoder supplies the encoded multi-channel audio signal EBS.

The down mixer 1 combines the stereo signal or stereo channels RI, LI into a
single channel audio signal (also referred to as monaural signal) SC. For example, the down
mixer 1 may determine the average of the input audio signals RI, LI.
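
By way of illustration only, the averaging performed by the down mixer 1 may be sketched
as follows. The function name downmix_average and the representation of the signals as
lists of floating point samples are assumptions made for this sketch and are not part of
the described encoder.

```python
def downmix_average(left, right):
    """Return the monaural signal SC as the sample-wise average of LI and RI."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    # Each output sample is the mean of the two input samples.
    return [(l + r) / 2.0 for l, r in zip(left, right)]


# Hypothetical input samples, chosen only to show the arithmetic.
sc = downmix_average([1.0, 0.5, -0.25], [0.0, 0.5, 0.25])
```

Other down mix rules are equally possible; the average is merely the example given above.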

The encoder 3 encodes the monaural signal SC to obtain an encoded monaural
signal ESC. The encoder 3 may be of a known kind, for example, an MPEG coder
(MPEG-LII, MPEG-LIII (mp3), or MPEG-2 AAC).



The parameter determining circuit 2 determines the sets of parameters S1, S2, ...
characterizing the information INF based on the input audio signals RI, LI. Optionally, the
parameter determining circuit 2 receives the maximum allowable bit rate MBR to only
determine the parameter sets S1, S2, ... which when coded by the parameter coder 4, together
with the encoded monaural signal ESC do not exceed the maximum allowable bit rate MBR.
The encoded parameters are denoted by EIN.

The formatter 5 combines the encoded monaural signal ESC and the encoded
parameters EIN in a data stream in a desired format to obtain the encoded multi-channel
audio signal EBS.

The operation of the encoder is elucidated in more detail in the following,
by way of example, with respect to an embodiment. The multi-channel audio signal LI, RI is
encoded in a single monaural signal SC (further also referred to as single channel audio
signal). The parameterization of spatial attributes of the multi-channel audio signals LI, RI is
performed by the parameter determining circuit 2. The parameters contain information on
how to restore the multi-channel audio signal LI, RI from the monaural signal SC. The
parameters are usually encoded by the parameter encoder 4 before combining them with the
encoded single monaural signal ESC. Thus, for general audio coding applications, these
parameters combined with only one monaural audio signal are transmitted or stored. The
combined coded signal is the encoded multi-channel audio signal EBS. The transmission or
storage capacity necessary to transmit or store the encoded multi-channel audio signal EBS is
strongly reduced compared to audio coders that process the multi-channels independently.
Nevertheless, the original spatial impression is maintained by the information INF which
contains the (sets of) parameters.

In particular, the parametric description of multi-channel audio RI, LI is
related to a binaural processing model which aims at describing the effective signal
processing of the binaural auditory system.

The model splits the incoming audio LI, RI into several band-limited signals,
which, preferably, are spaced linearly at an ERB-rate scale. The bandwidth of these signals
depends on the center frequency, following the ERB-rate. Subsequently, preferably, for every
frequency band, the following properties of the incoming signals are analyzed:

- The interaural level difference, or ILD, defined by the relative levels of the
band-limited signal stemming from the left and right ears,

- The interaural time (or phase) difference ITD (or IPD), defined by the interaural delay
(or phase shift) corresponding to the peak in the interaural cross-correlation function,
and

- The (dis)similarity of the waveforms that cannot be accounted for by ITDs or ILDs,
which can be parameterized by the maximum interaural cross-correlation IC (for
example, the value of the cross-correlation at the position of the maximum peak).
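
The three properties listed above may, purely as a sketch, be estimated for one
band-limited frame as shown below. The function name localization_cues, the use of an
energy ratio in dB for the ILD, the expression of the delay in samples, and the sign
convention (a positive lag means the right signal lags the left signal) are assumptions
of this sketch; the description does not prescribe a particular estimator.

```python
import math


def localization_cues(left, right, max_lag):
    """Estimate (ILD in dB, delay in samples, IC) for one band-limited frame."""
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    if energy_l == 0.0 or energy_r == 0.0:
        raise ValueError("both channels need non-zero energy")
    # ILD: relative level of the left and right band signals, in dB.
    ild_db = 10.0 * math.log10(energy_l / energy_r)
    norm = math.sqrt(energy_l * energy_r)
    n = len(left)
    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Normalized cross-correlation of left[i] with right[i + lag].
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        corr = acc / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    # The delay is the lag of the peak; IC is the value at that peak.
    return ild_db, best_lag, best_corr


# Hypothetical frame: the right signal is the left signal delayed by one
# sample and halved in amplitude.
left_band = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0]
right_band = [0.0, 0.0, 0.5, 0.0, -0.5, 0.0, 0.0, 0.0]
ild, delay, ic = localization_cues(left_band, right_band, max_lag=3)
```

In this hypothetical frame the level difference is about 6 dB, the peak lies at a lag of
one sample, and the cross-correlation at the peak is 1 because the waveforms are
identical apart from delay and gain.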

The sets S1, S2, ... of the three parameters, one set for each frequency band
FR1, FR2, ..., vary over time. However, since the binaural auditory system is very sluggish
in its processing, the update rate of these properties is rather low (typically tens of
milliseconds).

It may be assumed that the (slowly) time-varying parameters are the only
spatial signal properties that the binaural auditory system has available, and that from these
time and frequency dependent parameters, the perceived auditory world is reconstructed by
higher levels of the auditory system.

Fig. 2 shows a block diagram of a multi-channel decoder. The decoder
receives the encoded multi-channel audio signal EBS and supplies the recovered decoded
multi-channel audio signal which is shown as a stereo signal RO, LO.

The deformatter 6 retrieves the encoded monaural signal ESC and the
encoded parameters EIN from the data stream EBS. The decoder 7 decodes the encoded
monaural signal ESC into the output monaural signal SCO. The decoder 7 may be of any
known kind (of course matched to the encoder that has been used), for example, the decoder
7 is an MPEG decoder. The decoder 8 decodes the encoded parameters EIN into output
parameters INO.

The demultiplexer 9 recovers the output stereo audio signals LO and RO by
applying the parameter sets S1, S2, ... of the output parameters INO on the output monaural
signal SCO.
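
How the parameters are applied is not detailed here. As a deliberately simplified sketch,
the following assumes that only a single full-band level difference (ILD, in dB) is
applied, that the monaural signal is the average of the two channels, and that the two
output channels are scaled copies of the monaural signal. The function name
upmix_ild_only is hypothetical, and the time difference and the correlation are ignored.

```python
def upmix_ild_only(mono, ild_db):
    """Recover scaled left/right signals from a mono signal and one ILD value."""
    # Amplitude ratio left/right implied by the level difference in dB.
    ratio = 10.0 ** (ild_db / 20.0)
    # Choose gains so that (gain_l + gain_r) / 2 == 1, i.e. the average of the
    # outputs reproduces the monaural signal (an assumption of this sketch).
    gain_r = 2.0 / (1.0 + ratio)
    gain_l = ratio * gain_r
    left = [gain_l * s for s in mono]
    right = [gain_r * s for s in mono]
    return left, right


lo_same, ro_same = upmix_ild_only([0.5, -0.5], 0.0)  # no level difference
lo_loud, ro_loud = upmix_ild_only([0.5], 6.0)        # left about 6 dB louder
```

With one parameter set per frequency band, the same scaling would be applied band by
band; a complete decoder would also restore the time difference and the correlation.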

Fig. 3 shows a representation of the encoded data stream. For example, in each
frame F1, F2, ... the data package starts with a header H followed by the coded monaural
signal ESC now indicated by A, a first portion P1 of the encoded information EIN, a second
portion P2 of the encoded information EIN, and a third portion P3 of the encoded information
EIN.

If the frame F1, F2, ... only comprises the header H and the coded monaural
signal ESC, only the monaural signal SC is transmitted.



As disclosed in EP-A-1 107232, the full frequency band in which the input
audio signal occurs is divided into a plurality of sub-frequency bands, which together cover
the full frequency band. In the terminology in accordance with the invention, the
multi-channel information INF is encoded in a plurality of parameter sets S1, S2, ... one set for
each sub-frequency band FR1, FR2, .... This plurality of parameter sets S1, S2, ... is coded in the
first portion P1 of the encoded information EIN. Thus, to transmit a basic level quality
multi-channel audio signal, the bit stream comprises the header H, the portion A which is the coded
monaural signal ESC and the first portion P1.

In the bit stream in accordance with an embodiment of the invention, the first
portion P1 consists of a single set of parameters S1 only, the single set being determined for
the full bandwidth FR1. This bit stream which comprises the header H and the portions A and
P1 provides a basic layer of quality, indicated by BL in Fig. 3.

To support an enhanced quality, further portions P2, P3 of the coded
information EIN are present in the bit stream. These further portions form an enhancement
layer EL. The bit stream may comprise a single further portion P2 or more than one further
portion. The further portion P2 preferably comprises a plurality of sets S2, S3, ... of
parameters, one set for each sub-frequency band FR2, FR3, ..., the sub-frequency bands
FR2, FR3, ... preferably covering the full frequency band FR1. The enhanced quality may
also be present in a step-wise manner: a first enhancement level is provided by the
enhancement layer EL1 which comprises the first further portion P2. A second enhancement layer
EL comprises the first enhancement layer EL1 and the second enhancement layer EL2 which
comprises the portion P3.

The further portion P2 may also comprise a single set S2 of parameters
corresponding to a single frequency band FR2 which is a sub-band of the full frequency band
FR1. The further portion P2 may also comprise a number of sets of parameters S2, S3, ...
which correspond to frequency bands FR2, FR3, ... which together do not cover the complete
full frequency band FR1.

The further portion P3 preferably contains parameter sets for frequency bands
which sub-divide at least one of the sub-bands of the further portion P2.
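
The layered frame layout described with respect to Fig. 3 may be sketched as follows. The
class Frame, the functions strip_to_layers and serialize, and the byte string payloads
are hypothetical names and values chosen for this sketch; the actual header contents and
field lengths are not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    header: bytes                                        # H
    mono: bytes                                          # A, the coded monaural signal
    portions: List[bytes] = field(default_factory=list)  # P1, P2, P3, ...


def strip_to_layers(frame, keep_portions):
    """Return a copy of the frame keeping only the first keep_portions portions."""
    if keep_portions < 0:
        raise ValueError("keep_portions must not be negative")
    return Frame(frame.header, frame.mono, list(frame.portions[:keep_portions]))


def serialize(frame):
    """Concatenate H, A and the retained portions in stream order."""
    return frame.header + frame.mono + b"".join(frame.portions)


# Placeholder payloads standing in for real coded data.
full_frame = Frame(b"H", b"AAAA", [b"p1", b"p2p2", b"p3p3p3"])
base_only = strip_to_layers(full_frame, 1)  # base layer BL: H, A and P1
```

Because the portions are ordered from coarse to fine, dropping trailing portions leaves a
decodable stream of lower quality. A practical header would have to signal which portions
are present, which this sketch omits.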

This format of the bit stream in accordance with the invention allows the
quality of the decoded audio signal to be scaled, at the transmission channel or at the decoder,
with the bit rate of the transmission channel or the decoding complexity of the decoder. For
example, if the audio decoder should have a low power consumption, as is important in
portable applications, the decoder may have a low complexity and only uses the portions H,
A and P1. It would even be possible that the decoder is able to perform more complex
operations at a higher power consumption if the user indicates that he desires a higher quality
of the decoded audio.

It is also possible that the encoder is aware of the maximum allowable bit rate
MBR which may be transmitted via the transmission channel or which may be stored on a
storage medium. Now, the encoder is able to decide on how many, if any, portions P1,
P2, ... fit within the maximum allowable bit rate MBR. The encoder codes only these
allowable portions P1, P2, ... in the bit stream.
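
The decision on how many portions fit may be sketched as follows, assuming that the
maximum allowable bit rate MBR has already been converted into a bit budget per frame and
that the size in bits of each coded portion is known. The function name
portions_within_budget and all numbers are hypothetical.

```python
def portions_within_budget(header_bits, mono_bits, portion_bits, max_bits):
    """Return how many leading portions P1, P2, ... fit within max_bits per frame."""
    used = header_bits + mono_bits
    if used > max_bits:
        raise ValueError("budget cannot even carry the header and monaural signal")
    count = 0
    for bits in portion_bits:
        # Portions are added in order; stop at the first one that does not fit.
        if used + bits > max_bits:
            break
        used += bits
        count += 1
    return count


# Hypothetical sizes: 16 header bits, 1000 bits of mono data, and portions
# P1, P2, P3 of 20, 80 and 160 bits against a budget of 1120 bits per frame.
n_portions = portions_within_budget(16, 1000, [20, 80, 160], 1120)
```

The portions are considered strictly in order because a finer portion refines a coarser
one; skipping P2 in order to fit P3 would not be meaningful in this layout.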

Fig. 4 shows an embodiment of the frequency ranges in accordance with the
invention. In this embodiment, the frequency band FR1 is equal to the full bandwidth FBW
of the multi-channel audio signal LI, RI, and the frequency band FR2 is a sub-frequency band
of the full bandwidth FBW.

If these are the only frequency ranges for which parameter sets S1, S2, ... are
determined, a single parameter set S1 is determined for the frequency band FR1 and is
present in the portion P1, and a single parameter set S2 is determined for the frequency band
FR2 and is present in the portion P2. The quality scaling is possible by either using or not
using the portion P2.

Fig. 5 shows another embodiment of the frequency ranges in accordance with
the invention. In this embodiment, the frequency band FR1 is again equal to the full
bandwidth FBW, and the sub-frequency bands FR2 and FR3 together cover the full
bandwidth FBW. In other words, the frequency band FR1 is subdivided into the
sub-frequency bands FR2 and FR3.

If these are the only frequency ranges for which parameter sets S1, S2, ... are
determined, the portion P1 comprises a single parameter set S1 determined for the frequency
band FR1, and the portion P2 comprises two parameter sets S2 and S3 determined for the
frequency bands FR2 and FR3, respectively. The quality scaling is possible by either using or
not using the portion P2.

Fig. 6 shows the determination of the sets of parameters based on parameters
in a previous frame in accordance with an embodiment of the invention.

Fig. 6 shows a data stream which comprises in each frame F1, F2, ... the
coded information EIN which comprises the portion P1 which is part of the base layer BL
and the portion P2 which forms the enhancement layer EL.

In the frame F1, the portion P1 comprises a single set of parameters S1 which
are determined for the full bandwidth FR1. The portion P2, by way of example, comprises
four sets of parameters S2, S3, S4, S5 which are determined for the sub-frequency bands
FR2, FR3, FR4, FR5, respectively. The four sub-frequency bands FR2, FR3, FR4, FR5
sub-divide the frequency band FR1.

In the frame F2 which succeeds the frame F1, the portion P1 comprises a
single set of parameters S1' which are determined for the full bandwidth FR1 and are part of
the base layer BL'. The portion P2 comprises four sets of parameters S2', S3', S4', S5' which
are again determined for the sub-frequency bands FR2, FR3, FR4, FR5, respectively and
which form the enhancement layer EL'.

It is possible to code each of the sets of parameters S1, S2, ... for each one of
the frames F1, F2, ... separately. It is also possible to code the sets of parameters of the
portion P2 with respect to the parameters of the portion P1. This is indicated by the arrows
starting at S1 and ending at S2 to S5 in the frame F1. Of course this is also possible in the
other frames F2, ... (not shown). In the same manner, it is possible to code the set of
parameters S1' with respect to S1. And finally, the sets of parameters S2', S3', S4', S5' may
be coded with respect to the sets of parameters S2, S3, S4, S5.

In this manner, the bit rate of the encoded information EIN can be reduced as
the redundancy or correlation between sets of parameters Si is used.

Preferably, the new parameters of the new sets of parameters S1', S2', S3',
S4', S5' are coded as the difference of their value and the value of the parameters of the
previous sets of parameters S1, S2, S3, S4, S5.

At regular time intervals, at least the parameter set S1 has to be coded
absolutely and not differentially, to prevent errors from propagating for too long.
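
Differential coding with a periodic absolute refresh may be sketched as follows for a
single parameter tracked over successive frames. The function names encode_track and
decode_track, the symbol labels "abs" and "diff", and the fixed refresh interval are
assumptions of this sketch; the values are taken to be already quantized integers.

```python
def encode_track(values, refresh_interval):
    """Encode per-frame values as ("abs", v) or ("diff", v - previous) symbols."""
    symbols = []
    previous = None
    for frame_index, value in enumerate(values):
        if previous is None or frame_index % refresh_interval == 0:
            # Absolute coding: limits how long a transmission error can persist.
            symbols.append(("abs", value))
        else:
            symbols.append(("diff", value - previous))
        previous = value
    return symbols


def decode_track(symbols):
    """Invert encode_track."""
    values = []
    previous = None
    for kind, payload in symbols:
        if kind == "abs":
            previous = payload
        else:
            if previous is None:
                raise ValueError("difference received before any absolute value")
            previous = previous + payload
        values.append(previous)
    return values


# Hypothetical quantized parameter values for five successive frames.
track = [10, 11, 11, 9, 12]
coded = encode_track(track, refresh_interval=3)
```

With a refresh interval of three frames, an error in a difference can affect at most the
frames up to the next absolute value. The differences are typically small and therefore
cheaper to entropy code than the actual values.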

Fig. 7 shows a set of parameters. Each set of parameters Si may comprise one
or more parameters. Usually the parameters are localization cues which provide information
about the localization of sound objects in the audio information. Usually the localization cues
are the interaural level difference ILD, the interaural time or phase difference ITD or IPD,
and the interaural cross-correlation IC. More detailed information on these parameters is
provided in the Audio Engineering Society Convention Paper 5574 "Binaural Cue Coding
Applied to Stereo and Multi-channel Audio Compression" presented at the 112th Convention,
10-13 May 2002, Munich, Germany, by Christof Faller et al.

Fig. 8 shows the differential determination of a parameter of the base layer.
The horizontal axis indicates successive frames F1 to F5. The vertical axis shows the value
PVG of a parameter of the set of parameters S1 of the base layer BL. This parameter has the
values A1 to A5 for the frames F1 to F5 respectively. The contribution of this parameter to
the bit rate of the coded information EIN will decrease if not the actual values A2 to A5 of
the parameter are coded but the smaller differences D1, D2, ....

Fig. 9 shows the differential determination of the parameters corresponding to
a frequency region of an enhancement layer. The horizontal axis indicates two successive
frames F1 and F2. The vertical axis indicates the values of a particular parameter of the base
layer BL and the enhancement layer EL. In this example, the base layer BL comprises the
portion P1 of information INF with a single set of parameters determined for the full
frequency range FBW; the particular parameter of the portion P1 has the value A1 for the
frame F1 and A2 for the frame F2. The enhancement layer EL comprises the portion P2 of
information INF with three sets of parameters determined for three respective frequency
ranges FR2, FR3, FR4 which together fill the full frequency range FBW. The three particular
parameters (for example, the parameter representing the ILD) have a value B11, B12, B13 in
the frame F1 and a value B21, B22, B23 in the frame F2.

The contribution of these parameters to the bit rate of the coded information
EIN will decrease if not the actual values B11 to B23 of the particular parameter are coded
but the differences D11, D12, ..., because these differences can be encoded more efficiently
than the actual values.

To summarize, in a preferred embodiment in accordance with the invention, it
is proposed to organize the stereo parameter information INF such that a base layer BL
contains one set of parameters (preferably the time/level difference and the correlation) S1
which is determined for the full bandwidth FBW of the multi-channel audio signal LI, RI.
The enhancement layer EL contains multiple sets of parameters S2, S3, ... which correspond
to subsequent frequency intervals FR2, FR3, ... within the full bandwidth FBW. For bit-rate
efficiency, the sets of parameters S2, S3, ... in the enhancement layer EL can be
differentially encoded with respect to the set of parameters S1 in the base layer BL.
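
Coding the enhancement layer with respect to the base layer may be sketched as follows
for one parameter, for example the level difference. The function names and the integer
values are hypothetical; the sketch only shows that each band value is replaced by its
difference from the full-band value.

```python
def encode_relative_to_base(base_value, band_values):
    """Code each band value of the enhancement layer as a difference from S1."""
    return [value - base_value for value in band_values]


def decode_relative_to_base(base_value, residuals):
    """Recover the band values from the full-band value and the differences."""
    return [base_value + residual for residual in residuals]


# Hypothetical quantized values: full-band value 6, four band values.
band_residuals = encode_relative_to_base(6, [5, 7, 6, 8])
```

A decoder which only receives the base layer simply uses the full-band value for every
band, so the differences of the enhancement layer act as a refinement.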

The information INF is encoded in a multi-layered manner to enable a scaling
of the decoding quality versus bit rate.

To conclude, in the following, a preferred embodiment in accordance
with the invention is elucidated with respect to program code and its elucidation.

First, for all subframes (the portions P1, P2, ...) in the frames F1, F2, ... the
data ESC for the monaural representation SC, the data EIN for the set of stereo parameters S1
for the full bandwidth FBW and the stereo parameters S2, S3, ... for the frequency bins or
regions FR2, FR3, ... is determined.



The program code is shown at the left hand side, and an elucidation of the
program code is provided under description at the right hand side.



code                                            description
{
    for (f = 0; f < nrof_frames; f++)           for all frames do:
    {
        example_mono_frame(f)                   get data for monaural signal
                                                representation (the portion A in Fig. 3)
        example_stereo_extension_layer_1(f)     get data stereo parameters full
                                                bandwidth (the portion P1)
        example_stereo_extension_layer_2(f)     get data stereo parameters frequency
                                                bins (the portion P2)
    }
}

Secondly, depending on the value of the bit refresh_stereo the stereo
parameters for the full bandwidth are coded absolutely (the actual value is coded) or the
difference with previous values is coded. The following code is valid for the interaural level
difference ILD.

code                                            description
example_stereo_extension_layer_1(f)
{
    refresh_stereo                              1 bit denoting whether or not data is
                                                to be absolutely coded
    if (refresh_stereo == 1)                    if data is to be coded absolutely
    {
        ild_global[f]                           code the actual interaural intensity
                                                difference (ild) for the whole
                                                frequency area (global)
    }
    else                                        if not a refresh
    {
        ild_global_diff[f]                      code ild with respect to the previous
                                                frame
    }
}



Thirdly, depending on the value of the bit refresh_stereo the stereo parameters
for all of the frequency bins are coded absolutely (the actual value is coded) or the difference
with the corresponding parameters for the full bandwidth is coded. The following code is
valid for the interaural level difference ILD.



code                                            description
example_stereo_extension_layer_2(f)
{
    if (refresh_stereo == 1)                    if refresh
    {
        for (b = 0; b < nrof_bins; b++)         for all frequency bins
        {
            ild_bin[f, b]                       code the ild in that bin relative to
                                                the global value
        }
    }
    else                                        if no refresh
    {
        for (b = 0; b < nrof_bins; b++)         for all bins
        {
            ild_bin_diff[f, b]                  code the ild within a particular bin
                                                relative to the value in that bin in
                                                the previous frame
        }
    }
}



Wherein: 

The term "refresh_stereo" is a flag denoting whether or not the stereo 
parameters should be refreshed (0 = FALSE, 1 = TRUE). 

The term "ild_global[f]" represents the Huffman encoded absolute
representation level of the ILD for the whole frequency area for frame f.

The term "ild_global_diff[f]" represents the Huffman encoded relative
representation level of the ILD for the whole frequency area for frame f.

The term "ild_bin[f, b]" represents the Huffman encoded absolute
representation level of the ILD for frame f and bin b.

The term "ild_bin_diff[f, b]" represents the Huffman encoded relative
representation level of the ILD for frame f and bin b.
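
The per-bin layer can be sketched in the same way. This sketch follows the comments in the layer-2 listing, in which a refresh frame codes each bin relative to the global value and any other frame codes each bin relative to the same bin in the previous frame. It assumes integer representation levels, omits the Huffman stage, and uses the hypothetical names `encode_layer2` and `decode_layer2`.

```python
def encode_layer2(ild_bins, ild_global, refresh_flags):
    """Code per-bin ILD levels for every frame.

    ild_bins[f][b]   quantized ILD level of bin b in frame f
    ild_global[f]    full-bandwidth ILD level of frame f (layer 1)
    refresh_flags[f] value of refresh_stereo for frame f (0 or 1)
    """
    symbols = []
    for f, bins in enumerate(ild_bins):
        if refresh_flags[f] == 1:
            # ild_bin[f, b]: relative to the global value of this frame
            symbols.append([level - ild_global[f] for level in bins])
        else:
            if f == 0:
                raise ValueError("first frame must be a refresh frame")
            # ild_bin_diff[f, b]: relative to the same bin in the previous frame
            previous = ild_bins[f - 1]
            symbols.append([level - p for level, p in zip(bins, previous)])
    return symbols


def decode_layer2(symbols, ild_global, refresh_flags):
    """Invert encode_layer2, given the decoded layer-1 global levels."""
    frames = []
    for f, values in enumerate(symbols):
        if refresh_flags[f] == 1:
            bins = [ild_global[f] + v for v in values]
        else:
            if f == 0:
                raise ValueError("first frame must be a refresh frame")
            bins = [p + v for p, v in zip(frames[f - 1], values)]
        frames.append(bins)
    return frames


ild_global = [2, 3, 3]
ild_bins = [[1, 2, 4], [2, 3, 4], [2, 2, 5]]
refresh_flags = [1, 0, 0]
coded = encode_layer2(ild_bins, ild_global, refresh_flags)
decoded = decode_layer2(coded, ild_global, refresh_flags)
```

In both branches the coded values are differences, which tend to be small when the bins stay close to the global value or change slowly over time.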

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention, and that those skilled in the art will be able to design many alternative
embodiments without departing from the scope of the appended claims.
Although the invention is elucidated in the Figs. with respect to a stereo signal,
the extension to a more than two channel audio signal can easily be accomplished by the
skilled person.

In the claims, any reference signs placed between parentheses shall not be
construed as limiting the claim. The word "comprising" does not exclude the presence of
elements or steps other than those listed in a claim. The invention can be implemented by
means of hardware comprising several distinct elements, and by means of a suitably
programmed computer. In the device claim enumerating several means, several of these
means can be embodied by one and the same item of hardware. The mere fact that certain
measures are recited in mutually different dependent claims does not indicate that a
combination of these measures cannot be used to advantage.

In summary, multi-channel audio signals are coded into a monaural audio
signal and information allowing to recover the multi-channel audio signal from the monaural
audio signal and the information. The information is generated by determining a first portion
of the information for a first frequency region of the multi-channel audio signal, and by
determining a second portion of the information for a second frequency region of the
multi-channel audio signal. The second frequency region is a portion of the first frequency region
and thus is a sub-range of the first frequency region. The information is multi-layered,
enabling a scaling of the decoding quality versus bit rate.
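
The scaling of quality versus bit rate can be sketched as a simple layer selection. The sketch assumes the variant in which the monaural signal (portion A) and the full-bandwidth parameters (portion P1) form a base layer that is always written, while the per-bin parameters (portion P2) are written only if the total stays within the maximum allowable bit rate. The function name `select_layers` and the example rates are hypothetical.

```python
def select_layers(base_rate, enhancement_rate, max_rate):
    """Return the portions written to the bit stream.

    base_rate        bit rate of the monaural signal plus portion P1
    enhancement_rate bit rate of portion P2
    max_rate         maximum allowable bit rate of the encoded signal
    All rates are in bits per second.
    """
    layers = ["A", "P1"]  # base layer, always present in this variant
    if base_rate + enhancement_rate <= max_rate:
        layers.append("P2")  # enhancement layer fits within the budget
    return layers


full = select_layers(base_rate=24000, enhancement_rate=8000, max_rate=32000)
reduced = select_layers(base_rate=24000, enhancement_rate=8000, max_rate=30000)
```

Here `full` contains all three portions, whereas `reduced` contains only the base layer because the enhancement layer would exceed the budget.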
1. A method of encoding a multi-channel audio signal comprising at least two
audio channels, the method comprising,

generating a single channel audio signal and encoding the single channel audio
signal into a bit stream as an encoded single channel audio signal,

generating information from the at least two audio channels allowing to
recover with a required quality level the multi-channel audio signal from the single channel
audio signal and the information, the generating of the information comprising,

determining a first portion of the information for a first frequency region of the
multi-channel audio signal, and encoding the first portion of the information into the bit
stream as an encoded first portion of the information, and

determining a second portion of the information for a second frequency region
of the multi-channel audio signal, the second frequency region being a portion of the first
frequency region, and encoding the second portion of the information into the bit stream as
an encoded second portion of the information.


2. A method of encoding a multi-channel audio signal comprising at least two
audio channels, the method comprising,

generating a single channel audio signal,

generating information from the at least two audio channels allowing to
recover with a required quality level the multi-channel audio signal from the single channel
audio signal and the information, the generating of the information comprising,

receiving a maximum allowable bit rate of the encoded multi-channel audio
signal, and

only determining a first portion of the information for a first frequency region
of the multi-channel audio signal if a bit rate of the encoded multi-channel audio signal
comprising the single channel audio signal and the first portion of the information is not
higher than the maximum allowable bit rate.

3. A method of encoding as claimed in claim 1 or 2, wherein the single channel
audio signal is a particular combination of the at least two audio channels.

4. A method of encoding as claimed in claim 1, characterized in that the
information comprises sets of parameters, the first portion comprises at least a first one of the
sets of parameters, the second portion comprises at least a second one of the sets of
parameters, wherein each set of parameters is associated with a corresponding frequency
region.

5. A method of encoding as claimed in claim 4, characterized in that the sets of
parameters comprise at least one localization cue.

6. A method of encoding as claimed in claim 5, characterized in that the at least
one localization cue is selected from: an interaural level difference, an interaural time or
phase difference, or an interaural cross-correlation.

7. A method of encoding as claimed in claim 1 or 2, characterized in that the first 
frequency region covers a full bandwidth of the multi-channel audio signal. 

8. A method of encoding as claimed in claim 1, characterized in that the first
frequency region substantially covers a full bandwidth of the multi-channel audio signal, the
second frequency region covers a portion of the full bandwidth, and in that the determining of
the second portion of the information is adapted to determine sets of parameters for both the
second frequency region and a set of further frequency regions, the second frequency region
and the set of further frequency regions substantially covering the full bandwidth, wherein
the set of further frequency regions comprises at least one further frequency region.

9. A method of encoding as claimed in claim 8, characterized in that the single
channel audio signal and the first portion of the information form a base layer of information
which is always present in the encoded multi-channel audio signal, and in that the method
comprises receiving a maximum allowable bit rate of the encoded multi-channel audio signal,
the second portion of the information forming an enhancement layer of information which is
encoded only if the bit rate of the encoded base layer and enhancement layer is not higher
than the maximum allowable bit rate.

10. A method of encoding as claimed in claim 4, characterized in that the
determining of the first portion of information in a particular frame of encoded information
comprises determining the first one of the sets of parameters in the particular frame, and
coding the first one of the sets of parameters based on the first one of the sets of parameters
of a frame preceding the particular frame.

11. A method of encoding as claimed in claim 8, characterized in that the
determining of the second portion of information in a particular frame of the encoded
information comprises determining the sets of parameters of the second portion in the
particular frame and coding the sets of parameters of the second portion in the particular
frame based on the sets of parameters of a frame preceding the particular frame.

12. A method of encoding as claimed in claim 8, characterized in that the
determining of the second portion of information in a particular frame of the encoded
information comprises determining the sets of parameters of the second portion in the
particular frame and coding the sets of parameters of the second portion in the particular
frame based on the first one of the sets of parameters of a frame preceding the particular
frame.



13. A method of encoding as claimed in any one of the claims 10 to 12,
characterized in that the determining comprises calculating a difference between the
corresponding parameters in the particular frame and the frame preceding the particular
frame.



14. An encoder for coding a multi-channel audio signal comprising at least two
audio channels, the encoder comprising:

means for generating a single channel audio signal,

means for generating information from the at least two audio channels
allowing to recover with a required quality level the multi-channel audio signal from the
single channel audio signal and the information, the generating of the information
comprising,

means for determining a first portion of the information for a first frequency
region of the multi-channel audio signal, and

means for determining a second portion of the information for a second
frequency region of the multi-channel audio signal, the second frequency region being a
portion of the first frequency region.



15. An encoder for encoding a multi-channel audio signal comprising at least two
audio channels, the encoder comprising,

means for generating a single channel audio signal,

means for generating information from the at least two audio channels
allowing to recover with a required quality level the multi-channel audio signal from the
single channel audio signal and the information, the generating of the information
comprising,

means for receiving a maximum allowable bit rate of the encoded
multi-channel audio signal, and

means for only determining a first portion of the information for a first
frequency region of the multi-channel audio signal if a bit rate of the encoded multi-channel
audio signal comprising the single channel audio signal and the first portion of the
information is not higher than the maximum allowable bit rate.

16. An apparatus for supplying an audio signal, the apparatus comprising:

an input for receiving an audio signal,

an encoder as claimed in claim 14 or 15 for encoding the audio signal to obtain
an encoded audio signal, and

an output for supplying the encoded audio signal.

17. An encoded audio signal comprising:

a single channel audio signal,

information from the at least two audio channels allowing to recover with a
required quality level the multi-channel audio signal from the single channel audio signal and
the information, the information comprising,

a first portion of the information for a first frequency region of the
multi-channel audio signal, and

a second portion of the information for a second frequency region of the
multi-channel audio signal, the second frequency region being a portion of the first frequency
region.

18. A storage medium on which the encoded audio signal as claimed in claim 17
has been stored.

19. A method of decoding a multi-channel audio signal being encoded as claimed
in claim 17, the method of decoding comprising:

obtaining a decoded single channel audio signal,

obtaining decoded information from the information allowing to recover the
multi-channel audio signal from the decoded single channel audio signal and the decoded
information, the decoded information comprises the first portion of the information and the
second portion of the information, and

applying either the first portion of the information or the first portion and the
second portion of the information on the single channel audio signal to generate the decoded
multi-channel audio signal.


20. A decoder for decoding an encoded audio signal, the decoder comprising:

means for obtaining a decoded single channel audio signal,

means for obtaining decoded information from the information allowing to
recover the multi-channel audio signal from the decoded single channel audio signal and the
decoded information, the decoded information comprises the first portion of the information
and the second portion of the information, and

means for applying the first portion of the information and the second portion
of the information on the single channel audio signal to generate the decoded multi-channel
audio signal.


21. An apparatus for supplying a decoded audio signal, the apparatus comprising:

an input for receiving an encoded audio signal,

a decoder as claimed in claim 20 for decoding the encoded audio signal to
obtain a multi-channel output signal, and

an output for supplying or reproducing the multi-channel output signal.
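
The decoding step of applying either the first portion alone or the first and second portions together can be sketched as follows. Several points are assumptions made only for illustration and are not taken from the claims: the ILD is expressed in dB with positive values meaning a louder left channel, the monaural signal is given as one magnitude per frequency bin, and the gains are chosen so that their ratio equals the ILD while their squared sum is 2. The names `gains_from_ild` and `decode_frame` are hypothetical.

```python
import math


def gains_from_ild(ild_db):
    """Left and right gains whose level ratio equals ild_db (assumed rule)."""
    c = 10.0 ** (ild_db / 20.0)            # linear left/right amplitude ratio
    g_right = math.sqrt(2.0 / (1.0 + c * c))
    g_left = c * g_right
    return g_left, g_right


def decode_frame(mono_bins, ild_global_db, ild_bins_db=None):
    """Generate left and right bin values for one frame.

    With only the first portion available, the single full-bandwidth ILD is
    applied to every bin. With the second portion available as well, each
    bin uses its own ILD.
    """
    if ild_bins_db is None:
        ild_bins_db = [ild_global_db] * len(mono_bins)
    left, right = [], []
    for m, ild in zip(mono_bins, ild_bins_db):
        g_left, g_right = gains_from_ild(ild)
        left.append(g_left * m)
        right.append(g_right * m)
    return left, right


# Base layer only: one ILD for the whole bandwidth.
base_left, base_right = decode_frame([1.0, 2.0], 0.0)
# Base and enhancement layer: one ILD per bin.
fine_left, fine_right = decode_frame([1.0, 1.0], 0.0, [6.0, -6.0])
```

A decoder of low complexity, or one receiving a truncated bit stream, would take the first path; a decoder with the full bit stream would take the second.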
ABSTRACT:

Multi-channel audio signals are coded into a monaural audio signal and
information allowing to recover the multi-channel audio signal from the monaural audio
signal and the information. The information is generated by determining a first portion of the
information for a first frequency region of the multi-channel audio signal, and by determining
a second portion of the information for a second frequency region of the multi-channel audio
signal. The second frequency region is a portion of the first frequency region and thus is a
sub-range of the first frequency region. The information is multi-layered, enabling a scaling of
the decoding quality versus bit rate.

(Fig. 6)